by Jack Davis and David Awosoga
2023-11-23
Last time we talked about how to start a sports analytics project. Today we’re talking about how to finish one: what sorts of deliverables you could produce, and the basics of how to make them.
This talk comes to you in four parts: Deliverable options, Visualizations, Writing, and Reproducibility.
Deliverable options
Do it for fun and profit!
The profit isn’t going to come directly from ads or sales like it could from a more commercial, general audience blog. Instead, the entire blog is an advertisement for your skills and services. It’s a way to passively promote yourself, and a way to refer to or look at old projects when you’re not at your home computer.
It’s very good writing practice. After a while, you’ll find yourself writing better posts with less effort, and you can always go back and hide your oldest posts if you think they’re no longer optimally showing the world your skills.
“Although you can get ad revenue from blogging, the volume of people going to a stats blog usually isn’t worth much. The value you get is from showing off your expertise to the world.” – Éric Grenier http://www.threehundredeight.com/
Eric Cai, the Chemical Statistician http://www.statsblogs.com/author/eric-cai-the-chemical-statistician/
Andrew Gelman http://andrewgelman.com/
R-Bloggers https://www.r-bloggers.com/
Jack Davis https://www.stats-et-al.com
Academic posters are like shorter versions of blog posts. Use the Better Poster R Markdown template to make one that is instantly readable.
Industry papers are extended versions of blog posts, which can be as long as academic posters, but without the formality or the need for peer review.
arXiv is a pre-publisher for getting research onto the internet more quickly than with a traditional academic publication.
Video essays are a great way to get a lot of eyeballs on your work, while also getting a trickle of ad money. You can make one from a collection of visualizations while verbally narrating the words. However, you need to either know video editing on top of the many other skills a sports analytics person needs, or you need a collaborator.
Dorktown/Chart Party does sports analytics video using Google Satellite annotations.
Scorigami:
https://www.youtube.com/watch?v=9l5C8cGMueY
How to emulate this style:
https://www.youtube.com/watch?v=MfM7cqOlgds
Athletic Interest does video essays on the business of sport.
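# ggplot2 draws the plot; mandelbrot generates the fractal data.
library(ggplot2)
library(scatterplot3d)
library(plot3D)
library(mandelbrot)

# Palette: four yellows, four greys (light grey to black), four pinks.
wloo_cols <- c("#ffffaa", "#ffea3d", "#ffd54f", "#e4b429",  # yellow
               "#dfdfdf", "#a2a2a2", "#787878", "#000000",  # grey to black
               "#ffbeef", "#ff63aa", "#df2498", "#c60078")  # pink

# The first eight colours in reverse order: black through grey to light yellow.
wloo_cols_2 <- wloo_cols[8:1]

# 100-step gradient: 10 steps from black to light grey,
# then 90 steps from dark yellow to light yellow.
wloo_cols_100_2 <- c(colorRampPalette(wloo_cols[8:5], bias = 0.25)(10),
                     colorRampPalette(wloo_cols[4:1])(90))

# Compute a zoomed-in region of the Mandelbrot set.
mb4 <- mandelbrot(xlim = c(-0.83310, -0.833055),
                  ylim = c(0.20575, 0.205795),
                  resolution = list(x = 1400, y = 800),
                  iterations = 1000)

# Convert to a data frame so ggplot can map x, y, and value.
df2 <- as.data.frame(mb4)

# Draw the set as a raster with no axes and no legend.
g <- ggplot(df2, aes(x = x, y = y, fill = value)) +
  geom_raster(interpolate = TRUE) +
  theme_void() +
  scale_fill_gradientn(colours = wloo_cols_100_2, guide = "none")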
plot(g)

From Data to Viz (previously the ggplot gallery) has a lot of generic ggplot material for you to copy, paste, and modify.
ggplot gallery: https://www.data-to-viz.com/
Hockeyviz gives detailed visualizations of each NHL game. Patreon members get more details and get them sooner.
Hockeyviz: https://www.hockeyviz.com/ , https://www.hockeyviz.com/game/2023020243
rgl is the R interface to the OpenGL library. (See: https://en.wikipedia.org/wiki/OpenGL)
We are using a “hook” in knitr to use WebGL, which is OpenGL for webpages. That is one reason, among others, that these notes are in HTML slidy format.
See: https://bookdown.org/yihui/rmarkdown-cookbook/rgl-3d.html for details on this hook.
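Registering the hook usually happens once, in a setup chunk near the top of the R Markdown file. The following is a minimal sketch along the lines of the cookbook page linked above, not necessarily the exact setup used for these slides. It assumes the rgl and knitr packages are installed; the `rgl.useNULL` option is an added assumption that keeps rgl from opening a separate window while knitting.

```r
# Assumption: suppress the separate rgl window so scenes appear only in the knitted HTML
options(rgl.useNULL = TRUE)

library(rgl)
library(knitr)

# Register rgl's WebGL hook under the chunk option name "webgl".
# Any chunk with webgl=TRUE will then embed its rgl scene in the HTML output.
knit_hooks$set(webgl = hook_webgl)
```

After this runs, a chunk header such as `{r, webgl=TRUE}` is enough to turn an rgl plot into an interactive widget in the knitted page.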
The R code block on this slide has webgl set to TRUE to allow for embedding. It won’t work in the RStudio preview viewer that pops up after you knit something, but it WILL work if you load the slides in something like Firefox. (Or Chrome, I guess.)
{r, webgl=TRUE}
If everything works, you should get a 3D scatterplot (made with rgl’s plot3d() function; note the lower case d) where the dots are drawn in a rainbow arranged along the x-axis. You should be able to rotate the image by clicking and dragging.
(The lower case d is important because there is also a plot3D with an uppercase D that comes from the plot3D package)
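A plot like the one described could be produced along these lines. This is a sketch with simulated data, not the original chunk from these slides: the variable names, the sample size, and the `rgl.useNULL` option are all assumptions made for illustration.

```r
# Assumption: no separate rgl window is needed (e.g. when knitting to HTML)
options(rgl.useNULL = TRUE)
library(rgl)

set.seed(2023)
n <- 300
x <- sort(rnorm(n))  # sorted so that colour order follows the x-axis
y <- rnorm(n)
z <- rnorm(n)

# One rainbow colour per point; because x is sorted, the hue changes along x
point_cols <- rainbow(n)

# rgl's plot3d() (lower case d), not plot3D from the plot3D package
plot3d(x, y, z, col = point_cols, size = 5)
```

Sorting x before assigning colours is what lines the rainbow up with the x-axis; with unsorted data you would instead index the palette by `rank(x)`.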
Some 3D plot functions in other packages can also use the rgl package to make interactive graphs. The scatter3d function in the car package can use it to make interactive plots.
This R code block also has webgl=TRUE.
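As a rough illustration of scatter3d, the sketch below plots the `Duncan` occupational prestige data that is loaded alongside car. The choice of dataset and variables is an assumption for demonstration, not the example from the original slide, and it assumes both car and rgl are installed.

```r
# Assumption: no separate rgl window is needed (e.g. when knitting to HTML)
options(rgl.useNULL = TRUE)
library(car)  # also makes the Duncan data from carData available

# Formula interface: response ~ first predictor + second predictor.
# scatter3d draws the points with rgl and adds a fitted regression surface.
scatter3d(prestige ~ income + education, data = Duncan)
```

In a chunk with webgl=TRUE, the resulting scene can be rotated by clicking and dragging, just like the plot3d() output.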
For more information on using rgl, see the following
documentation:
Papers are big and intimidating to write, and imagining everything involved in writing one is pretty much impossible, at least for modern papers. Instead, it’s much easier to think about and write small parts of a paper at a time, and then do any necessary synthesis at the end.
The Chicago Guide to Writing about Multivariate Analysis has some good discussions on describing statistical results, as well as the principles of tables and visualizations.
https://press.uchicago.edu/ucp/books/book/chicago/C/bo15506942.html
The Chicago Guide to Writing About Numbers is a shorter, more general version of “Multivariate Analysis”.
https://press.uchicago.edu/ucp/books/book/chicago/C/bo19910133.html
Your best approach may be to find writing examples that are both good and easy to emulate. Visualizing Baseball is a short, easy read on baseball analytics with lots of ggplot output.
https://www.goodreads.com/book/show/57496821-visualizing-baseball
Anything from the Hockey Abstract is equally good. It’s a collection of short analyses of NHL hockey, and is itself an emulation of the much older Baseball Abstract series.
http://www.hockeyabstract.com/statshot
Other honourable mentions for good books to read for inspiration and examples of what you could do:
Basketball Data Science in R (more technical, closely linked to companion package) https://www.routledge.com/Basketball-Data-Science-With-Applications-in-R/Zuccolotto-Manisera/p/book/9781138600799
Squares & Sharps, Suckers & Sharks / Monte Carlo or Bust (Gambling, less technical, but still rigorous) https://www.goodreads.com/book/show/30167627-squares-and-sharps-suckers-and-sharks
Soccernomics (Even less technical, general interest but good example for talking about business) https://www.goodreads.com/book/show/6617185-soccernomics